Stable Architectures for Deep Neural Networks
Authors
Abstract
Deep neural networks have become valuable tools for supervised machine learning, e.g., in the classification of text or images. While offering superior results over traditional techniques at finding and expressing complicated patterns in data, deep architectures are known to be challenging to design and train so that they generalize well to new data. An important issue that must be overcome is numerical instability in derivative-based learning algorithms, commonly referred to as exploding or vanishing gradients. In this paper, we propose new forward propagation techniques inspired by systems of ordinary differential equations (ODEs) that overcome this challenge and lead to well-posed learning problems for arbitrarily deep networks. The backbone of our approach is interpreting deep learning as a parameter estimation problem for nonlinear dynamical systems. Given this formulation, we analyze the stability and well-posedness of deep learning and use this new understanding to develop new network architectures. We relate the exploding and vanishing gradient phenomenon to the stability of the discrete ODE and present several strategies for stabilizing deep learning for very deep networks. While our new architectures restrict the solution space, several numerical experiments show their competitiveness with state-of-the-art networks.
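The ODE view of forward propagation can be made concrete with a small sketch. The code below is a hedged illustration, not the paper's exact architecture: the layer width, step size `h`, `tanh` activation, and the antisymmetric parameterization `K = W - W.T` are illustrative assumptions. Interpreting a residual layer as one explicit Euler step of dY/dt = tanh(K(t)Y + b(t)), an antisymmetric K keeps the eigenvalues of the linearized dynamics on the imaginary axis, so signals neither blow up nor decay rapidly as depth grows.

```python
import numpy as np

def forward_euler_net(y0, weights, biases, h=0.1):
    """Forward propagation as an explicit Euler discretization of
    dY/dt = tanh(K(t) Y + b(t)), with K antisymmetric for stability."""
    y = y0
    for W, b in zip(weights, biases):
        K = W - W.T                 # antisymmetric: purely imaginary eigenvalues
        y = y + h * np.tanh(K @ y + b)
    return y

# Illustrative deep network: 50 layers of width 4.
rng = np.random.default_rng(0)
depth, dim = 50, 4
weights = [rng.standard_normal((dim, dim)) for _ in range(depth)]
biases = [rng.standard_normal(dim) for _ in range(depth)]
y = forward_euler_net(rng.standard_normal(dim), weights, biases)
print(y.shape)  # features survive 50 layers without exploding
```

Note the design choice: antisymmetry is enforced by construction (`W - W.T`) rather than by a penalty, so stability holds for any parameter values the optimizer visits, at the cost of restricting the solution space as the abstract notes.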
Similar resources
Integration of remote sensing and meteorological data to predict flooding time using deep learning algorithm
Accurate flood forecasting is a vital need for reducing its risks. Due to the complicated structure of floods and river flow, this problem is difficult to solve. Artificial neural networks, such as recurrent neural networks, offer good performance on time series data. In recent years, the use of Long Short-Term Memory networks has attracted much attention due to the shortcomings of recurrent ne...
Analysis of Deep Convolutional Neural Network Architectures
In computer vision, many tasks are solved using machine learning. In the past few years, state-of-the-art results in computer vision have been achieved using deep learning. Deeper machine learning architectures are better able to handle complex recognition tasks than previous, shallower models. Many architectures for computer vision make use of convolutional neural networks which ...
Reversible Architectures for Arbitrarily Deep Residual Neural Networks
Recently, deep residual networks have been successfully applied in many computer vision and natural language processing tasks, pushing the state-of-the-art performance with deeper and wider architectures. In this work, we interpret deep residual networks as ordinary differential equations (ODEs), which have long been studied in mathematics and physics with rich theoretical and empirical success...
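Reversibility can be sketched with an additive coupling block (a RevNet-style construction; the residual functions `F` and `G` below are illustrative placeholders, not the cited paper's exact design). The input is split into two channels, each half updates the other, and every step can be inverted algebraically, so intermediate activations need not be stored for backpropagation.

```python
import numpy as np

def rev_block_forward(x1, x2, F, G):
    """Additive coupling: each half updates the other, so the map is invertible."""
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def rev_block_inverse(y1, y2, F, G):
    """Recover the inputs exactly by undoing the two updates in reverse order."""
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

F = G = np.tanh  # illustrative residual functions; any maps of matching shape work
x1, x2 = np.array([1.0, -2.0]), np.array([0.5, 3.0])
y1, y2 = rev_block_forward(x1, x2, F, G)
r1, r2 = rev_block_inverse(y1, y2, F, G)
# r1 and r2 reproduce x1 and x2 up to floating point
```

Because the inverse performs the same arithmetic in reverse, memory cost per layer is constant in depth, which is what makes "arbitrarily deep" residual networks practical.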
Overview of Deep Neural Networks
In recent years, new neural network models with deep architectures have started to get more attention in the field of machine learning. These models contain a larger number of layers (hence "deep") than the conventional multi-layered perceptron, which usually uses only two or three functional layers of neurons. To overcome the difficulties of training such complex networks, new learning algorithms hav...
Exploring the Imposition of Synaptic Precision Restrictions For Evolutionary Synthesis of Deep Neural Networks
A key contributing factor to the incredible success of deep neural networks has been the significant rise of massively parallel computing devices, allowing researchers to greatly increase the size and depth of deep neural networks, leading to significant improvements in modeling accuracy. Although deeper, larger, or more complex deep neural networks have shown considerable promise, the computational comp...
Journal: CoRR
Volume: abs/1705.03341
Publication date: 2017